STATISTICAL MODELS FOR TIME SERIES AND LIFE TESTING


WITH  APPLICATIONS  IN  ENGINEERING  SYSTEMS 


OVERVIEW OF RESEARCH

Over the period of this grant, 34 papers have been published, 5 papers
have been accepted for publication, 36 technical reports have been
written, and one book has been written and another revised. All of
these have received support from the grant. Many of the publications
have reflected our continuing interest in the topics of stochastic
systems (especially with discrete control) in model building, smoothing
and curve estimation, methods of approximation with noisy data,
inferences with censored data, and life testing and reliability with
applications to systems.


Time  Series  Models 


Today much more efficient system control, surveillance, and
prediction are possible because more effective instrumentation has
increased the capability for continuous or rapid intermittent
measurement. However, classical statistical methodology, in which
observations are assumed to be statistically independent, is
inappropriate for the analysis of the resulting data. It has, therefore,
been necessary to develop new techniques of stochastic model building
for data analysis, forecasting, and stochastic control. Such methods
have wide applicability to practical problems ranging from missile
tracking to estimating the effect of a change in Air Force recruiting
policy. Research in this contract period has led to deeper
understanding and further development of appropriate model forms, to
improved methods of model identification, to more exact methods of
estimation of parameters in stochastic models, and to better methods of
diagnostic checking of the fitted models.

New statistical methods for solving a number of other important
practical problems in data analysis have been developed. A series of
results in curve fitting and smoothing using splines provided a
practical nonparametric method for smoothing noisy data and a
practical solution to the notoriously difficult problem of
differentiating noisy data. Surface data can also be smoothed using
these newly developed techniques. Recent work in regression has
provided a technique whereby the user can choose between subset
selection, ridge, or principal components estimates for modelling a
regression problem. This method can be used in some situations even
where the number of variables exceeds the number of data points.


Remote sensing experiments typically lead to a data analysis
problem requiring the approximate solution of a Fredholm integral
equation of the first kind, where the data are noisy. It is
notoriously  difficult  to  obtain  a good  estimate  to  the  true  solution 
to  these  equations,  and  ad  hoc  methods  are  frequently  employed. 
Research  under  this  grant  has  led  to  the  first  practical,  completely 
automatic  method,  with  proven  good  properties,  for  obtaining  an 
estimate  of  the  solution. 

Several other hitherto unsolved problems were solved with the
support of this grant. Practical methods for estimating the optimum
bandwidth parameter for spectral density and density estimates were
obtained, thus allowing for the elimination of the subjective methods
commonly used for this purpose. The problem of the optimum input
for system identification of the stationary linear dynamic model was
solved. Advances in goodness-of-fit tests and the theory of optimal
control were also made.

Life Testing and Reliability

Research supported by this grant during the past four years has
produced major advancements in several challenging areas of system
reliability modeling and statistical inferences with life testing
experiments. The primary goal of our research program has been
steadily directed to the needs of modern day technology, where the
progressive complexity of system structures demands a more sophisticated
formulation of models. The models must adequately describe the
chance failure mechanisms in order to reach implications regarding
the system's successful completion of mission. Included in this broad
perspective is our sustained effort to strengthen statistical theory
and techniques for careful reliability analyses that are indispensable
in preventing costly failures in complicated systems.

A dominant concentration of our research activity was aimed at
improving models for system reliability so as to incorporate possible
interaction  between  components  and  the  uncertainties  in  the  external 
stresses.  Substantial  progress  was  achieved  in  developing  statistical 
analyses  for  reliability  structures  that  extended  far  beyond  the 
simplistic  models  treated  in  contemporary  literature.  Research 
contributions  in  this  direction  are  of  significant  value  because 
improved  modeling  is  at  the  very  root  of  success  in  implementing 
programs  of  maintenance  and  ascertaining  system  availability. 

Another area of major achievement within our research program
concerns optimal statistical procedures for dealing with studies of
equipment failures where practical constraints of cost and time
necessitate an early termination of the experiments. Statistical
inferences with censored data were intensively investigated during
this period. Several remarkable results enriched this important
theoretical area of statistics and at the same time had direct
bearing on the analysis of censored life testing data.

Techniques for the statistical control of manufacturing
processes are invaluable for maintaining high quality production.
In our study of the very popular technique called a cusum test, we
have alerted potential users to a serious shortcoming. When serial
correlation is present, as is commonly the case, the technique may well
fail to protect the designated level of quality. In addition, our
approximations to the test and its properties provide the necessary
alterations to surmount these difficulties.

Moreover,  our  fruitful  method  of  analysis  also  leads  to  novel 
techniques  for  the  sequential  detection  of  time  series  model  changes. 
These  include  statistical  methods  for  continuously  monitoring  critical 
system  parameters  for  time  dependent  models.  These  techniques  and 
their  extensions  provide  quantitative  tools  for  maintaining  complex 
systems  to  run  at  top  performance. 

Most importantly, our research activity was considerably broader
in scope than the major goals stated above. It extended into several
important general areas of statistical theory. Significant contributions
were made in large sample theory, nonparametric methods, quality control,
multivariate analysis and inferences in stochastic processes. Aside
from their role in enhancing knowledge in the domain of statistical
theory, many of these results have potential applications in
diverse areas of scientific investigation.

DESCRIPTION OF RESEARCH ACTIVITIES

Our  research  efforts  have  resulted  in  several  innovative 
advances  in  modeling  and  inferences,  and  we  now  describe  these  in 
some detail.

Part  1 - Time  Series  Models 


Further  Development  of  Practical  Stochastic  Models 


In [1] the following question is addressed: "How are stochastic
difference equation time series models related to more traditional
deterministic models, and how may imbedded deterministic components
be recognized?"

In  [2]  the  relation  of  linear  stochastic  difference  equation 
models  to  exponential  smoothing  methods  is  elucidated.  In  [3]  the 
effect  of  non-Normal  innovations  is  considered.  Research  on  the 
detection and tracking of parameter changes is discussed in [4], and
[5] considers the appropriate choice of sampling interval.
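
As a minimal illustration of the kind of relation discussed in [2], the
sketch below assumes the simplest case, an integrated first-order moving
average model z_t - z_{t-1} = a_t - theta * a_{t-1}. Its one-step
forecasts coincide with exponential smoothing using the smoothing
constant 1 - theta. The function names, the simulated series and the
value theta = 0.6 are illustrative assumptions and are not taken from
the report.

```python
import numpy as np

def ima_forecasts(z, theta, start):
    """One-step forecasts for z_t - z_{t-1} = a_t - theta * a_{t-1}."""
    forecasts = [start]
    for value in z[:-1]:
        shock = value - forecasts[-1]            # one-step forecast error a_t
        forecasts.append(value - theta * shock)  # forecast of the next value
    return np.array(forecasts)

def exponential_smoothing(z, alpha, start):
    """Forecast = alpha * latest value + (1 - alpha) * previous forecast."""
    forecasts = [start]
    for value in z[:-1]:
        forecasts.append(alpha * value + (1 - alpha) * forecasts[-1])
    return np.array(forecasts)

rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(size=50))   # hypothetical nonstationary series
theta = 0.6                               # assumed moving average parameter
same = np.allclose(ima_forecasts(series, theta, series[0]),
                   exponential_smoothing(series, 1 - theta, series[0]))
print(same)
```

The agreement is algebraic: substituting the forecast error into the
first recursion gives (1 - theta) * z_t + theta * (previous forecast),
which is the second recursion with alpha = 1 - theta.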

In [6] the relation between parametric time series methods and the
"state variable" approach is considered. In [7] the problem is
addressed of how best to control a system when the allowable
variation in the manipulated variable is constrained. In [8] the
problem of optimum feedforward control is explored. Particularly
after "linearization", models are liable to severe inhomogeneity of
variance which, if ignored, can result in inefficient estimation.
Frequently the difficulty can be corrected by an appropriate
transformation. Methods for estimating the transformation are
derived in [9].

Stochastic Model Building

Model building is an iterative process which may be conveniently
divided into three stages: (i) identification (or specification) of
the model form (autocorrelation and cross correlation analysis are of
special importance at this stage); (ii) estimation of system parameters
for the tentatively identified model (for example, by maximum likelihood
methods); (iii) diagnostic checking of the adequacy of the fitted model
(by study of residuals). During this grant period advances have been
made in all three areas. At the identification stage a deeper
understanding of the nature of partial autocorrelations was obtained
by considering their role in a Bayesian context [10]. In earlier
work, system parameters were estimated using approximate maximum
likelihood. Improved exact likelihood estimation was developed in
[11]. In practice data often contain outliers or "bad" values. In
[12] and [13] estimation of parameters in the presence of bad values
is explored. Operating systems frequently employ feedback loops.
Special methods needed in the analysis of the resulting data were
studied in [14, 15, 16]. A portmanteau test statistic for detecting
lack of model fit using the autocorrelations of residuals was developed
earlier. A much closer and more useful approximation to the
distribution of this statistic has now been devised [17]. It is
important that major developments be brought to the attention of
the audience who can put them to use. Therefore, summary accounts of
recent developments have been prepared and published [18, 19].
Developments in the analysis of multiple time series and in
intervention analysis were described in [20]. During the grant
period a book on Bayesian inference [21] has been published, research
for which was partly sponsored by the grant. Also, a revised
edition of a very successful book on time series forecasting has
appeared.
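
The three stages can be sketched on a toy example. The code below is an
illustration under stated assumptions, not the procedure of the cited
papers: a first-order autoregressive model is taken as the tentatively
identified form, it is fitted by least squares, and the residual
autocorrelations are combined into a portmanteau statistic
Q = n * sum(r_k^2). The modified form n(n+2) * sum(r_k^2 / (n-k)) is a
widely used refinement; treating it as the approximation devised in [17]
is an assumption. All function names are hypothetical.

```python
import numpy as np

def fit_ar1(z):
    """Least-squares estimate of phi in z_t = phi * z_{t-1} + a_t (mean removed)."""
    z = np.asarray(z, dtype=float) - np.mean(z)
    phi = np.dot(z[1:], z[:-1]) / np.dot(z[:-1], z[:-1])
    residuals = z[1:] - phi * z[:-1]
    return phi, residuals

def autocorrelations(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[k:], x[:-k]) / denom
                     for k in range(1, max_lag + 1)])

def portmanteau(residuals, max_lag):
    """Return (Q, modified Q) computed from the residual autocorrelations."""
    n = len(residuals)
    r = autocorrelations(residuals, max_lag)
    lags = np.arange(1, max_lag + 1)
    q = n * np.sum(r ** 2)
    q_modified = n * (n + 2) * np.sum(r ** 2 / (n - lags))
    return q, q_modified

# Simulated data from an AR(1) process with phi = 0.7 (illustrative only).
rng = np.random.default_rng(1)
shocks = rng.normal(size=300)
z = np.zeros(300)
for t in range(1, 300):
    z[t] = 0.7 * z[t - 1] + shocks[t]

phi_hat, resid = fit_ar1(z)
q, q_mod = portmanteau(resid, max_lag=20)
print(round(phi_hat, 2), round(q, 1), round(q_mod, 1))
```

When the fitted model is adequate, either statistic is usually referred
to a chi-square distribution with max_lag minus the number of fitted
parameters degrees of freedom; large values point to lack of fit.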

Curve and Surface Smoothing, Numerical Differentiation of Noisy Data

The problem of nonparametric curve fitting when the model is
$y(t_i) = f(t_i) + \epsilon_i$, $t_i \in [0,1]$, where the sequence
$\{\epsilon_i\}$ is white Gaussian noise and $f$ is only assumed to be
"smooth", has been solved. A smoothing spline is used to smooth the
data, and the method of generalized cross-validation, developed under
this grant, is used to estimate the optimum degree of smoothing from
the data. Since the optimum degree of smoothing is nearly obtained,
numerical differentiation can be implemented by differentiating the
smoothing spline even for moderately noisy data. Theoretical
properties, discussion of various applications, and numerical
demonstrations of the effectiveness of the method appear in
[27,28,29,30]. A discussion of earlier methods appears in [31]. A new
solution to the problem of smoothing irregularly spaced data on a
surface, using an extension of the curve smoothing methods, appears
in [32].
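
A rough sketch of the idea follows. To keep it short, a discrete
second-difference penalty (a Whittaker-type smoother) stands in for the
cubic smoothing spline, so this is not the method of [27-30] itself. The
smoothing parameter is chosen by minimizing the generalized
cross-validation score V(lambda) = n * RSS / (trace(I - A(lambda)))^2,
where A(lambda) is the influence matrix. The test function, noise level
and grid of lambda values are assumptions made for illustration.

```python
import numpy as np

def gcv_smooth(y, lambdas):
    """Penalized least-squares smoother with lambda chosen by GCV."""
    n = len(y)
    d2 = np.diff(np.eye(n), n=2, axis=0)    # (n-2) x n second-difference matrix
    penalty = d2.T @ d2
    best = None
    for lam in lambdas:
        hat = np.linalg.inv(np.eye(n) + lam * penalty)  # influence matrix A(lambda)
        fitted = hat @ y
        score = n * np.sum((y - fitted) ** 2) / (n - np.trace(hat)) ** 2
        if best is None or score < best[0]:
            best = (score, lam, fitted)
    return best[1], best[2]

# Noisy observations of a smooth function on [0, 1] (illustrative data).
rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 101)
truth = np.sin(2 * np.pi * t)
y = truth + rng.normal(scale=0.3, size=t.size)

lam, smooth = gcv_smooth(y, lambdas=10.0 ** np.arange(-2, 7))
slope = np.gradient(smooth, t)      # derivative estimated from the smooth
raw_slope = np.gradient(y, t)       # naive derivative of the noisy data
true_slope = 2 * np.pi * np.cos(2 * np.pi * t)
print(lam, np.mean((smooth - truth) ** 2), np.mean((y - truth) ** 2))
```

Differencing the raw data amplifies the noise by roughly the inverse of
the grid spacing, which is why the derivative is taken from the smoothed
values instead.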

Regression 

It is well known that Gauss-Markov or minimum variance unbiased
regression estimates can be very bad from the point of view of mean
square error, and so the use of biased estimates (to reduce the mean
square error) is becoming increasingly popular. A new technique
(generalized cross-validation) has been developed for choosing the
ridge parameter in a ridge estimate. This method can also be used to
select a subset when subset selection regression methods are used and
to choose the principal components when a principal components method
is used. In fact, the technique can be used to choose between the
"best" ridge, subset selection or principal components estimate. The
technique has the good properties of the popular Mallows $C_p$ and
$C_L$ statistics but, unlike $C_p$ or $C_L$, can be used (in certain
circumstances) when there are no degrees of freedom for estimating
$\sigma^2$, that is, when the number of candidate variables is as large
as or larger than the number of data points. These results appear in
[33].
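
The ridge case can be sketched as follows, assuming the usual form of
the generalized cross-validation score, n * RSS / (n - trace A(lambda))^2,
evaluated through the singular value decomposition of the design
matrix. The simulated data, the grid of candidate values and the
function name are illustrative assumptions. The example deliberately
uses more variables (30) than data points (20).

```python
import numpy as np

def ridge_gcv(x, y, lambdas):
    """Choose the ridge parameter by GCV using the SVD of the design matrix."""
    n = len(y)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    uty = u.T @ y
    scores = []
    for lam in lambdas:
        shrink = s ** 2 / (s ** 2 + lam)     # eigenvalues of A(lambda)
        fitted = u @ (shrink * uty)
        rss = np.sum((y - fitted) ** 2)
        scores.append(n * rss / (n - np.sum(shrink)) ** 2)
    lam = lambdas[int(np.argmin(scores))]
    beta = vt.T @ (s / (s ** 2 + lam) * uty)  # ridge coefficients at chosen lambda
    return lam, beta, np.array(scores)

rng = np.random.default_rng(3)
n, p = 20, 30                                 # more variables than observations
x = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.0, 0.5]
y = x @ beta_true + rng.normal(scale=0.5, size=n)

lambdas = 10.0 ** np.arange(-3, 4)
lam, beta, scores = ridge_gcv(x, y, lambdas)
print(lam, scores.round(3))
```

No separate estimate of the error variance enters the score, which is
what allows it to be computed when no degrees of freedom remain for
estimating that variance.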

Approximate  Solution  of  Linear  Operator  Equations  Arising  in  Remote 
Sensing  Experiments 

Early work in this area focussed on development and evaluation of
methods for obtaining approximate solutions of linear operator equations,
including integral and differential equations, using projection methods
in reproducing kernel Hilbert spaces [34,35,36,37,38]. The
mathematics is formally similar to time series methods for
continuous-time time series problems, although the applications are
different.

This theoretical work laid the foundations for practical methods for
solving first kind integral equations when the data are noisy. It is
very frequently necessary to solve these equations numerically in
analyzing experimental data in physics, meteorology, biology,
geophysics, etc., since this type of equation models the typical indirect
sensing experiment. The method of regularization is one of the major
techniques for solving first kind integral equations. In this method
the user must choose the regularization parameter, which controls the
bias-variance or stability-fidelity tradeoff of the solution. Numerous
ad hoc methods have been proposed for choosing this parameter, but for
many years it has been an open question how to choose this parameter
from the data without prior information. This question was answered
while the solver was supported by the grant; the result will be
published shortly [39].
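
For orientation, the setting can be written out as below. The notation
(kernel K, unknown function f, observation points t_i, noise
\epsilon_i, penalty J) is assumed here for illustration; the report
does not fix these symbols, and the exact functional used in [39] may
differ.

```latex
% Fredholm integral equation of the first kind, observed with noise
y_i = \int_0^1 K(t_i, s)\, f(s)\, ds + \epsilon_i , \qquad i = 1, \dots, n .

% Regularized estimate: \lambda > 0 trades fidelity to the data
% against stability (smoothness) of the solution
f_\lambda = \arg\min_{f} \; \frac{1}{n} \sum_{i=1}^{n}
  \Bigl( y_i - \int_0^1 K(t_i, s)\, f(s)\, ds \Bigr)^{2} + \lambda\, J(f),
\qquad \text{e.g. } J(f) = \int_0^1 \bigl( f''(s) \bigr)^{2}\, ds .
```

A small \lambda follows the noisy data closely and gives an unstable
solution; a large \lambda gives a stable but biased one. Choosing
\lambda from the data alone is the question addressed in the text.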

Density  and  Spectral  Density  Estimation 

Density estimates are commonly used as descriptive procedures
for data. Also, questions arising in a variety of applied problems
reduce to questions concerning properties of a density, and,
ultimately, multivariate density or likelihood function estimates will
provide the answer to certain particularly difficult classification
problems. Several new density estimates have been proposed and their
properties obtained [40, 41, 42, 43]. A fundamental theoretical
result was obtained in this area: all the good estimates have,
asymptotically, the same mean square error convergence rates under
comparable circumstances if their control parameter, which controls
the squared bias-variance tradeoff, is chosen correctly [41, 42, 43,
44]. The fundamental practical culmination of this work is the
development of a density estimation technique for which the value of
the control parameter which minimizes integrated mean square error can
be estimated from the data [43]. This method can also be used to choose
the optimal (integrated mean square error) bandwidth parameter in a
window-type spectral density estimate [28], thus answering another
long-open question.
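
The role of the control parameter can be made concrete in one standard
case. The formulas below assume a kernel-type estimate with a
symmetric, second-order kernel K and a twice-differentiable density f;
the estimates studied in [40-43] may be of other types, so this is an
illustration of the tradeoff rather than a statement of those results.

```latex
% Kernel density estimate with bandwidth (control parameter) h
\hat f_h(x) = \frac{1}{n h} \sum_{i=1}^{n} K\!\Bigl( \frac{x - X_i}{h} \Bigr)

% Leading terms of the integrated mean square error:
% squared bias grows with h, variance shrinks with h
\mathrm{IMSE}(h) \approx \frac{h^{4}}{4}\, \mu_2(K)^{2} \int \bigl( f''(x) \bigr)^{2} dx
  \; + \; \frac{R(K)}{n h},
\qquad \mu_2(K) = \int u^{2} K(u)\, du, \quad R(K) = \int K(u)^{2}\, du

% Minimizing over h
h^{*} = \Bigl[ \frac{R(K)}{\mu_2(K)^{2} \int ( f'' )^{2} \; n} \Bigr]^{1/5},
\qquad \mathrm{IMSE}(h^{*}) = O\bigl( n^{-4/5} \bigr)
```

The optimal h depends on the unknown f through the integral of the
squared second derivative, which is why a way of estimating the control
parameter from the data is of practical importance.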

Optimal Experimental Design for System Identification

The problem of selecting the optimal input for estimating the
parameters of a stationary linear dynamic model was solved [45, 46,
47]. This work is frequently referenced.

k-Spacings and Goodness of Fit Tests

The  k-spacings  are  defined  as  the  distances  between  every  k-th 
order  statistics. 
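
A minimal sketch of this definition follows, assuming a sample on
[0, 1] with the end points 0 and 1 added as boundary values and
non-overlapping gaps (every k-th order statistic). Both conventions,
and the function name, are assumptions made for illustration; the
reports cited below may define the boundary terms differently.

```python
import numpy as np

def k_spacings(sample, k):
    """Gaps between every k-th order statistic of a sample on [0, 1]."""
    u = np.sort(np.asarray(sample, dtype=float))
    padded = np.concatenate(([0.0], u, [1.0]))   # U_(0) = 0 and U_(n+1) = 1
    if (len(padded) - 1) % k != 0:
        raise ValueError("n + 1 must be divisible by k")
    return np.diff(padded[::k])

rng = np.random.default_rng(4)
sample = rng.uniform(size=11)        # n = 11, so n + 1 = 12
d = k_spacings(sample, k=3)          # four 3-spacings
print(d, d.sum())
```

With k = 1 this reduces to the ordinary spacings between successive
order statistics.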

In [48] the asymptotic distribution of certain goodness of fit
tests based on k-spacings is obtained. It is shown that tests with
k>1 are more powerful asymptotically than the usual spacings tests
based on k=1.

In  [49]  a relationship  is  established  between  the  empirical 
distribution  functions  (e.d.f.)  of  distributions  with  unknown  scale 
parameters  and  the  e.d.f.  of  independent  random  variables  subject  to 
scale  perturbations.  The  relation  is  exploited  to  obtain  significance 
points  for  some  tests  based  on  the  e.d.f.  of  the  k-spacings. 

Reports  [50]  and  [51]  are  the  first  two  of  a series  of  four 
reports  dealing  with  the  theory  of  k-spacings  and  their  applications  to 
goodness  of  fit  tests.  Report  [50]  contains  an  introduction  to  the 
series  and  shows  the  main  properties  and  applications  of  the 
Dirichlet distribution. Report [51] reviews several methods used to
study the distribution of the k-spacings for k=1 and extends them to
the case k>1. In particular, an important theorem of LeCam (1958) is
extended.

Stiff Differential Equations

These  equations  may  have  highly  oscillatory  solutions  which  are 
exceedingly  difficult  to  compute  numerically.  A completely  new 
approach  to  their  numerical  solution  has  been  proposed.  Instead  of 
trying  to  compute  the  exact  solution,  which  may  be  impossible,  a 
method  for  computing  the  low-pass-filtered  solution  was  developed  [52]. 

Control  Theory 

It  was  shown  that  the  general  linear  plant  problem  can  be 
solved  by  reproducing  kernel  Hilbert  space  techniques  when  arbitrary 
linear  constraints  which  can  be  expressed  as  continuous  linear 
functionals  are  imposed  [53].  A convergent  numerical  method  was 
obtained  for  minimizing  a quadratic  functional  (on  a reproducing 
kernel Hilbert space) when a continuous family of linear inequality
constraints is imposed [54].




Inferences  from  Censored  Data 

Experiments  in  life  testing  often  must  be  terminated  before  all 
units  fail.  Consequently,  statistical  procedures  based  on  censored 
data play an important role in reliability studies, especially of
highly  reliable  components  or  systems.  Advances  were  made  on  several 
aspects  of  this  primary  problem  area  of  censored  life  tests. 
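
As a simple illustration of inference from a censored life test, the
sketch below assumes exponentially distributed lifetimes and a test
that is stopped at the r-th failure out of n units. Under these
assumptions the maximum likelihood estimate of the mean life is the
total time on test divided by r. This is a textbook case chosen for
illustration, not one of the procedures developed under the grant, and
the function name is hypothetical.

```python
def exponential_mle_type2(failure_times, n_on_test):
    """MLE of the exponential mean when testing stops at the r-th failure."""
    times = sorted(failure_times)
    r = len(times)
    if r == 0 or r > n_on_test:
        raise ValueError("need between 1 and n_on_test observed failures")
    # Failed units contribute their failure times; the n - r survivors
    # each contribute the time at which the test was stopped.
    total_time_on_test = sum(times) + (n_on_test - r) * times[-1]
    return total_time_on_test / r

# Five units on test, stopped at the third failure.
print(exponential_mle_type2([1.0, 2.0, 3.0], n_on_test=5))  # 12 / 3 = 4.0
```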

Locally most powerful rank tests for a general parameter in a
two-sample single-censored situation are derived in [1], and the work
includes an investigation of asymptotic power and efficiency.
Asymptotically optimal inference procedures are developed in [2]
for inferences with censored data using only very weak regularity
conditions. Our investigations with censored data were extended to
multiple-censored situations where blocks of order statistics are
censored. In [3], several important relationships have been
established between the moments of the hazard rate, or its multiple
censoring extensions, and the terms in the expressions for the
uncensored situation. This study leads to an exact expression for
the finite sample Fisher information, whereas only the asymptotic
expressions were previously available. The ideas developed in [1]
are extended to derive locally most powerful tests for multiple-censored
data [6]. The completeness properties of both parametric and
nonparametric families of life distributions under censoring, and
implications regarding inferences on reliability, are investigated in
[4]. The asymptotic sufficiency of the ranks is established in [5]
for the two sample situation when the observations are censored at the
rth order statistic. Several important conclusions are drawn
concerning optimal tests.

Statistical Modeling and Inference for Complex Systems

A major thrust of our research was directed towards reliability
studies of multicomponent systems by monitoring the failures of
components and modeling their interactive behavior. Inferences with
the bivariate reliability model of Marshall and Olkin are investigated
in [7], [8], where existence, uniqueness and asymptotic properties of
the maximum likelihood estimator are studied and a uniformly most
powerful test of independence is derived. Important advances are
made in [9], [10] in terms of constructing realistic and tractable
models for the reliability of a system whose components have
variable strengths and are subjected to random stresses
emanating from the operating environment. In a parametric setting,
the UMVU and maximum likelihood estimators as well as confidence
bounds are derived in [9] for the reliability of an s out of k system.
Optimal nonparametric estimates, their large sample distribution and
efficiency are studied in [11]. Important generalizations of the
stress-strength models to more complex systems and associated inference
problems are treated in [10], which also includes a Bayesian technique
for certain parametric models. In addition, optimal estimation
procedures for system reliability and large sample confidence bounds
are developed in [12] for the experimental situation where groups of
components are tested under common stresses and, instead of individual
strength and stress measurements, only the survivor counts are recorded.
The study includes an evaluation of the loss of efficiency in using
the count data.
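
The basic stress-strength idea can be illustrated for a single
component, whose reliability is the probability that its strength
exceeds the stress applied to it. The sketch below assumes independent
samples of strengths and stresses and uses the proportion of
(strength, stress) pairs in which the strength is larger, a simple
nonparametric estimate of that probability. It is not the estimator of
[9]-[12] for multicomponent systems; the data and the function name are
hypothetical.

```python
import numpy as np

def stress_strength_reliability(strengths, stresses):
    """Proportion of (strength, stress) pairs with strength > stress."""
    x = np.asarray(strengths, dtype=float)[:, None]   # column of strengths
    y = np.asarray(stresses, dtype=float)[None, :]    # row of stresses
    return float(np.mean(x > y))                      # average over all pairs

estimate = stress_strength_reliability([3.0, 5.0, 7.0], [1.0, 4.0, 6.0])
print(estimate)   # 6 of the 9 pairs have strength > stress
```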

Statistical  Techniques  for  the  Control  of  Quality 

Our investigation started by developing large sample approximations
to the distribution of Page's cumulative sum (cusum) test. Considerable
effort was expended in order to study the complete run length
distribution rather than just the mean time to signal 'out-of-control'.
The basic results were derived by relating the statistic to a functional
of a Wiener process. Next, identifying the problem with that of a
continuous random walk with one reflecting and one absorbing barrier,
we obtained the distributional results in [13].

Some practical consequences of high import then followed directly.
The same mathematical derivation also applies when the observations
exhibit serial correlation. Thus, in [15], [16] we were able to study
the change in distribution of the time to signal under autoregressive
and other dependent models. Our primary conclusion was that the cusum
tests are not robust with respect to departures from independence.
The use of cusum tests is now so widespread, and the presence of
serial correlation so common, that attention must be drawn to the
seriousness of this lack of robustness.
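
The lack of robustness can be seen in a small simulation. The sketch
below is a Monte Carlo illustration, not the Wiener process
approximation of [13]-[16]. It runs a one-sided cusum on in-control
data that are either independent or first-order autoregressive with
the same marginal variance, and compares the average time to a false
signal. The reference value 0.5, decision limit 4.0, autoregressive
parameter 0.5 and all function names are assumptions chosen for
illustration.

```python
import numpy as np

def cusum_run_length(x, reference, limit):
    """Time of the first signal of a one-sided cusum (len(x) if it never signals)."""
    s = 0.0
    for t, value in enumerate(x, start=1):
        s = max(0.0, s + value - reference)   # Page's recursion
        if s > limit:
            return t
    return len(x)

def ar1_series(phi, length, rng):
    """Zero-mean AR(1) series scaled to unit marginal variance."""
    shocks = rng.normal(size=length)
    x = np.empty(length)
    x[0] = shocks[0]
    scale = np.sqrt(1.0 - phi ** 2)
    for t in range(1, length):
        x[t] = phi * x[t - 1] + scale * shocks[t]
    return x

def average_run_length(phi, reps=200, length=2000, seed=0):
    """Monte Carlo estimate of the in-control average run length."""
    rng = np.random.default_rng(seed)
    runs = [cusum_run_length(ar1_series(phi, length, rng), 0.5, 4.0)
            for _ in range(reps)]
    return float(np.mean(runs))

arl_independent = average_run_length(phi=0.0)
arl_correlated = average_run_length(phi=0.5)
print(arl_independent, arl_correlated)
```

With positive serial correlation the false signals arrive much sooner
than the design for independent observations would suggest, even though
the process mean and variance are unchanged.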

Our investigation continued in [14] where, for the first time, an
analytic derivation is given for the optimal choice of a reference
value based on the Wiener approximation. This value agrees with the
rule suggested by others on the basis of Monte Carlo studies. Secondly,
we determined the effect of using an estimate for variance and the
manner in which the use of an estimate interacts with the choice of a
reference value.

Pursuing this line of attack, we obtained potentially the most
important breakthrough concerning the development of statistical
techniques for monitoring the parameters of integrated autoregressive
moving average time series [17]. These time series models are among
the most widely used statistical models in both forecasting situations
and basic modeling of critical system performance indicators. In either
situation, environmental shocks can cause disruption and result in
model changes which must be detected quickly. Our techniques form the
first steps of an extensive search for new statistical methods
designed to meet this purpose.


Other Advances in Research

In  [18],  our  goal  was  to  study  classification  procedures  from  the 
point  of  view  of  finding  low  dimensional  hyperplanes  which  in  some 
sense  best  represent  the  p-variate  population  distributions  and  their 
samples.  Our  criterion  of  weighted  loss  of  distance  between  observations 
and  sample  centroids  shows  explicitly  the  manner  in  which  the  prior 
probabilities  enter.  We  establish  two  new  optimality  properties  for 
Fisher's solution to finding a lower dimensional representation.
Moreover, our class of choices, including Fisher's, can be viewed as an
orthogonal transformation of the standardized principal components from
the common covariance matrix.

In [19], a uniformly most powerful unbiased test is derived for
testing the equality of inter-transition time distributions for a
two-state semi-Markov process under an inverse sampling scheme. Tests
with regular sampling, their exact power and asymptotic properties are
also investigated for semi-Markov processes. These processes are useful
in studying system failure and preventive maintenance policies, consumer
brand switching patterns in marketing and also in modeling various
stochastic phenomena in other fields.

A purchase  incidence  model  is  introduced  in  [20]  where  the 
interpurchase  times  are  described  by  a two-parameter  inverse  Gaussian 
distribution  and  the  population  heterogeneity  is  modeled  by  the  natural 
conjugate  family.  This  model  is  more  flexible  than  the  exponential  and 
one-parameter  gamma  models  which  were  previously  used  for  purchase 
incidence. 

Significant advances were made in the understanding of the large
sample behavior of posterior distributions in [21] and [22]. These
results have importance in almost all situations where Bayesian
methods of inference are employed, including applications to life
testing. The posterior distribution is considered in the general
multiparameter situation when the population belongs to some
exponential family. First, an asymptotic expansion in powers of
$n^{-1/2}$ is obtained for the posterior distribution, having the
limiting normal as a leading term. The following terms can be used to
correct the limiting normal approximation.

Besides the basic result on the expansion of the posterior
distribution we obtain a similar result for risk. For the first
time, expansions are also given for the marginal distributions and
conditional distributions for the parameters. These motivate an
interesting discussion of asymptotic independence. We conclude with
a novel study of an approximation to the regions of highest posterior
density obtained by modifying the normal ellipsoidal regions by
using a correction term.

Families  of  discrete  distributions  were  characterized  by 
probability generating functions involving hypergeometric or confluent
hypergeometric  functions.  Estimators  of  the  parameters  were  obtained 
and  their  behavior  examined  on  the  basis  of  asymptotic  relative 
efficiency  [23]. 

The  distortion  of  the  t-distribution  has  been  previously  examined 
when  the  parent  population  is  a mixture  of  normals.  During  the 
present  research  period  some  equal  probability  contours  were  computed 
showing  precisely  the  amount  of  distortion  from  a pre-specified  level 
of  significance  [25]. 

A solution  to  the  Behrens-Fisher  problem  was  previously  developed. 
During the present research period the results were revised and
rewritten in a form suitable for publication [24].
